14 research outputs found

    Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation

    Learning-based approaches to robotic manipulation are limited by the scalability of data collection and the accessibility of labels. In this paper, we present a multi-task domain adaptation framework for instance grasping in cluttered scenes that utilizes simulated robot experiments. Our neural network takes monocular RGB images and the instance segmentation mask of a specified target object as inputs, and predicts the probability of successfully grasping the specified object for each candidate motor command. The proposed transfer learning framework trains a model for instance grasping in simulation and uses a domain-adversarial loss to transfer the trained model to real robots using indiscriminate grasping data, which is available both in simulation and in the real world. We evaluate our model in real-world robot experiments, comparing it with alternative model architectures as well as an indiscriminate grasping baseline. Comment: ICRA 201
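    The abstract does not include code; the sketch below shows one common way to implement the domain-adversarial piece in PyTorch, using a gradient-reversal layer so that a domain classifier (simulation vs. real) pushes the shared features toward domain invariance while a grasp-success head trains on the simulated instance-grasping labels. The layer sizes, the 512-dimensional input feature (standing in for the encoded RGB image, mask, and motor command), and the `lambda_adv` weight are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class GraspModel(nn.Module):
    """Shared encoder with a grasp-success head and an adversarial domain head (illustrative sizes)."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        self.grasp_head = nn.Linear(feat_dim, 1)    # P(success | image, mask, motor command)
        self.domain_head = nn.Linear(feat_dim, 1)   # simulation vs. real

    def forward(self, x, lam=1.0):
        feat = self.encoder(x)
        grasp_logit = self.grasp_head(feat)
        domain_logit = self.domain_head(GradReverse.apply(feat, lam))
        return grasp_logit, domain_logit

def total_loss(grasp_logit, grasp_label, domain_logit, domain_label, lambda_adv=0.1):
    # Task loss on (simulated) instance-grasping labels plus a domain-confusion term
    # computed on indiscriminate grasping data from both domains.
    task = F.binary_cross_entropy_with_logits(grasp_logit, grasp_label)
    domain = F.binary_cross_entropy_with_logits(domain_logit, domain_label)
    return task + lambda_adv * domain
```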

    Using Computer Vision To Label And Search A Physical Space

    Effective operation of a warehouse requires keeping track of the location of various assets within the physical environment. As various sensors are carried through the warehouse environment by operators, range data collected by the sensors over time can be used to reconstruct 2D and 3D representations of the space. This disclosure describes techniques to estimate the locations of points of interest (POIs) and regions of interest (ROIs) within a physical environment such as a warehouse. The location estimates are generated using a combination of 2D visual search of images containing text labels and barcodes, 2D/3D environment reconstruction using sensor data, and the estimated trajectory of the sensors. Computer vision techniques are applied to visual data obtained from operational processes that generate images, such as feeds from stationary cameras, images from moving cameras, photos of the environment, etc.
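    As a rough illustration of how the 2D detections and the reconstructed sensor trajectory could be fused into a POI location estimate, the sketch below back-projects the pixel center of a detected label through a pinhole camera model using the per-frame pose from the trajectory, then averages the observations. The function names, the availability of a depth value per detection, and the simple mean fusion are assumptions for illustration, not the disclosure's exact procedure.

```python
import numpy as np

def backproject(K, R, t, pixel_uv, depth):
    """Lift the 2D center of a detected text label or barcode into world coordinates,
    using the camera intrinsics K and the camera pose (R, t) taken from the
    reconstructed sensor trajectory at the time of the observation."""
    u, v = pixel_uv
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray in the camera frame
    point_cam = ray_cam * depth                          # scale by the measured range
    return R @ point_cam + t                             # transform into the world frame

def estimate_poi(detections):
    """Fuse repeated observations of the same label into one POI location estimate.
    Each detection is a dict with keys "K", "R", "t", "uv", "depth"; a simple mean
    is used here, though a real system might weight or robustify the fusion."""
    points = [backproject(d["K"], d["R"], d["t"], d["uv"], d["depth"]) for d in detections]
    return np.mean(points, axis=0)
```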

    Online learning of patch perspective rectification for efficient object detection

    For a large class of applications, there is time to train the system. In this paper, we propose a learning-based approach to patch perspective rectification and show that it is both faster and more reliable than state-of-the-art ad hoc affine region detection methods. Our method proceeds in three steps. First, a classifier provides for every keypoint not only its identity but also a first estimate of its transformation. This estimate allows carrying out, in the second step, an accurate perspective rectification using linear predictors. We show that both the classifier and the linear predictors can be trained online, which makes the approach convenient. The last step is a fast verification, made possible by the accurate perspective rectification, of the patch identity and its sub-pixel precision position estimate. We test our approach on real-time 3D object detection and tracking applications, and show that we can use the estimated perspective rectifications to determine the object pose; as a result, we need far fewer correspondences to obtain a precise pose estimate.
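    The second step, perspective rectification with linear predictors, has a simple generic form: a learned matrix maps the intensity residual between the patch sampled under the current transformation estimate and the reference patch to a correction of the transformation parameters. The sketch below shows that update loop; the warp function, the parameterization of `p`, and the matrix `A` are placeholders rather than the paper's trained predictors.

```python
import numpy as np

def refine(p, A, reference, image, warp, n_iters=5):
    """Iteratively refine transformation parameters p with a learned linear predictor A.

    At each iteration, the image is sampled under the current parameters and the
    intensity residual against the reference patch is mapped to a parameter update:
        p <- p + A @ (warp(image, p) - reference)
    """
    ref = reference.ravel().astype(np.float64)
    for _ in range(n_iters):
        sampled = warp(image, p).ravel().astype(np.float64)  # patch under the current estimate
        p = p + A @ (sampled - ref)                           # linear correction of the parameters
    return p
```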

    Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects

    We present a method for real-time 3D object detection that does not require a time-consuming training stage and can handle untextured objects. At its core is a novel template representation designed to be robust to small image transformations. This robustness, based on dominant gradient orientations, lets us test only a small subset of all possible pixel locations when parsing the image and represent a 3D object with a limited set of templates. We show that, together with a binary representation that makes evaluation very fast and a branch-and-bound approach to efficiently scan the image, it can detect untextured objects in complex situations and provide their 3D pose in real time.
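    The binary template matching described above can be sketched as follows: quantize gradient orientations into a few bins, OR the bits present inside each small cell into one bitmask (which gives tolerance to small shifts), and score a template at an image location by counting cells whose masks share at least one orientation bit. The 8-bin quantization, cell size, and thresholds below are illustrative choices, not the exact representation from the paper.

```python
import numpy as np

N_BINS = 8  # orientation quantization (illustrative)

def orientation_mask(gx, gy, mag_thresh=10.0):
    """Per-pixel bitmask with one bit set for the quantized gradient orientation
    (zero where the gradient magnitude is too weak to be reliable)."""
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)              # orientations taken modulo 180 degrees
    bins = (ang / np.pi * N_BINS).astype(int) % N_BINS
    return np.where(mag > mag_thresh, 1 << bins, 0).astype(np.uint8)

def cell_masks(mask, cell=8):
    """OR the orientation bits within each cell, giving robustness to small image transformations."""
    h, w = mask.shape
    out = np.zeros((h // cell, w // cell), dtype=np.uint8)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            block = mask[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            out[i, j] = np.bitwise_or.reduce(block.ravel())
    return out

def similarity(template_cells, image_cells):
    """Count cells whose dominant orientations overlap (non-zero bitwise AND)."""
    return int(np.count_nonzero(template_cells & image_cells))
```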

    Learning Real-Time Perspective Patch Rectification

    We propose two learning-based methods for patch rectification that are faster and more reliable than state-of-the-art affine region detection methods. Given a reference view of a patch, they can quickly recognize it in new views and accurately estimate the homography between the reference view and the new view. Our methods consume more memory than affine region detectors and are in practice currently limited to a few tens of patches. However, if the reference image is a fronto-parallel view and the internal parameters are known, one single patch is often enough to precisely estimate an object pose. As a result, we can deal in real time with objects that are significantly less textured than the ones required by state-of-the-art methods. The first method favors fast run-time performance while the second is designed for fast real-time learning and robustness; however, they follow the same general approach: first, a classifier provides for every keypoint a first estimate of its transformation; then, the estimate allows carrying out an accurate perspective rectification using linear predictors; the last step is a fast verification, made possible by the accurate perspective rectification, of the patch identity and its sub-pixel precision position estimation. We demonstrate th
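    The claim that a single fronto-parallel reference patch is enough for pose estimation can be illustrated with a planar PnP solve: once the homography from the reference patch to the new view has been estimated, the patch's four corners give four 3D-2D correspondences, which together with the known camera intrinsics determine the pose. The sketch below assumes a unit-square reference patch and a known physical patch size, and uses OpenCV only as a convenient solver; none of this is the paper's implementation.

```python
import numpy as np
import cv2

def pose_from_patch(H, K, patch_size=0.1):
    """Recover an object pose from a single rectified reference patch.

    H: 3x3 homography mapping unit-square reference-patch coordinates to the new view.
    K: 3x3 camera intrinsics.
    patch_size: assumed physical side length of the patch in meters.
    """
    s = patch_size / 2.0
    # Patch corners on the object plane (z = 0) ...
    obj_pts = np.array([[-s, -s, 0.0], [s, -s, 0.0], [s, s, 0.0], [-s, s, 0.0]])
    # ... and the matching corners of the unit-square reference patch, pushed through H.
    ref_corners = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0],
                            [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
    proj = (H @ ref_corners.T).T
    img_pts = proj[:, :2] / proj[:, 2:3]                 # corner positions in the new view (pixels)
    # Four coplanar 3D-2D correspondences are enough for a planar PnP solve.
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None, flags=cv2.SOLVEPNP_IPPE)
    return rvec, tvec
```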